Welcome to this notebook, dedicated to our Python project.
Project reminder: from a set of data, carry out a complete study with visualization and machine learning algorithms in order to explain the links between the variables of the dataset.
Perform the visualization study in a Jupyter notebook and offer a Flask API that exposes one of the best prediction models you find, where a user can choose the parameters for the model.
Note I : The main purpose here is to estimate the number of comments a Facebook post should receive in the hours following its publication. The number of comments is modeled by the "Target Variable" column of the datasets. The analysis first studies user behavior and the most prominent trends through different graphs. Then comes the Machine Learning part, where we perform a PCA and implement several prediction algorithms using regression techniques.
Note II : [For the Prediction part] The Flask API lives in the same directory and can be run from your terminal with the command: python MyFlaskApp.py
#!pip install plotly
#!pip install pygal
#!pip install wordcloud
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
import pygal
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
from PIL import Image
from collections import Counter
# interactive visualization
import plotly.express as px
datafb1 = pd.read_csv("C:/Users/alixp/OneDrive/Bureau/ESILV/A4/Python/Features_Variant_1.csv",index_col = None,header = None)
datafb2 = pd.read_csv("C:/Users/alixp/OneDrive/Bureau/ESILV/A4/Python/Features_Variant_2.csv",index_col = None,header = None)
datafb3 = pd.read_csv("C:/Users/alixp/OneDrive/Bureau/ESILV/A4/Python/Features_Variant_3.csv",index_col = None,header = None)
1) Fast inspection of the dataset :
datafb1.head(5)
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 634995 | 0 | 463 | 1 | 0.0 | 806.0 | 11.291045 | 1.0 | 70.495138 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 634995 | 0 | 463 | 1 | 0.0 | 806.0 | 11.291045 | 1.0 | 70.495138 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 2 | 634995 | 0 | 463 | 1 | 0.0 | 806.0 | 11.291045 | 1.0 | 70.495138 | 0.0 | ... | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 3 | 634995 | 0 | 463 | 1 | 0.0 | 806.0 | 11.291045 | 1.0 | 70.495138 | 0.0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 634995 | 0 | 463 | 1 | 0.0 | 806.0 | 11.291045 | 1.0 | 70.495138 | 0.0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
5 rows × 54 columns
2) Rename the columns of all the datasets as explained in the accompanying document
column_names= ["Page Popularity/likes","Page Checkings","Page talking about","Page Category",
"Derived_5","Derived_6","Derived_7","Derived_8","Derived_9","Derived_10","Derived_11","Derived_12","Derived_13","Derived_14",
"Derived_15","Derived_16","Derived_17","Derived_18","Derived_19","Derived_20","Derived_21","Derived_22",
"Derived_23","Derived_24","Derived_25","Derived_26","Derived_27","Derived_28","Derived_29",
"CC1","CC2","CC3","CC4","CC5","Base time","Post length","Post Share Count","Post Promotion Status"
,"H Local","Post published weekday_Sunday","Post published weekday_Monday","Post published weekday_Tuesday",
"Post published weekday_Wednesday","Post published weekday_Thursday","Post published weekday_Friday","Post published weekday_Saturday",
"base_dt_weekday_Sunday","base_dt_weekday_Monday","base_dt_weekday_Tuesday","base_dt_weekday_Wednesday",
"base_dt_weekday_Thursday","base_dt_weekday_Friday","base_dt_weekday_Saturday","Target Variable"]
datafb1.columns=column_names
datafb2.columns=column_names
datafb3.columns=column_names
3) Verification of the datasets structures
print("############ dataset 1 : INFORMATION ##########\n\n")
datafb1.info()
print("\n\nshape of first dataset", datafb1.shape)
print("\n\n\n\n############ dataset 2 : INFORMATION ##########\n\n")
datafb2.info()
print("\n\nshape of second dataset", datafb2.shape)
print("\n\n\n\n############ dataset 3 : INFORMATION ##########\n\n")
datafb3.info()
print("\n\nshape of third dataset", datafb3.shape)
[Output condensed: `DataFrame.info()` reports 54 columns for each dataset — 25 float64 and 29 int64 — with every column fully non-null. Shapes: dataset 1 (40949, 54), dataset 2 (81312, 54), dataset 3 (121098, 54).]
4) Check the datasets for missing values
print("############ dataset 1 : ##########\n\n", datafb1.isna().sum())
print("\n\n\n\n############ dataset 2 : ##########\n\n" , datafb2.isna().sum())
print("\n\n\n\n############ dataset 3 : ##########\n\n" , datafb3.isna().sum())
[Output condensed: `isna().sum()` returns 0 for every column of all three datasets.]
There is no null value in any of the three datasets, which is good news: every row is usable.
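Since the later steps assume complete rows, this observation can also be turned into a programmatic guard. A minimal sketch on a hypothetical stand-in frame (the real call would use `datafb1`, `datafb2`, `datafb3`):

```python
import pandas as pd

# hypothetical stand-in for one of the datafb frames
demo = pd.DataFrame({"CC1": [3, 0, 7], "Target Variable": [5, 1, 2]})

# per-column missing counts, as printed in the notebook
missing_per_column = demo.isna().sum()

# collapse to a single boolean and fail fast if anything is missing
assert not demo.isna().any().any(), "unexpected missing values"
```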
5) Summarize the post's publication weekday in a single column [For the visualization part only]
def day(row):
    if row["Post published weekday_Sunday"] == 1:
        return 'Sunday'
    if row["Post published weekday_Monday"] == 1:
        return 'Monday'
    if row["Post published weekday_Tuesday"] == 1:
        return 'Tuesday'
    if row["Post published weekday_Wednesday"] == 1:
        return 'Wednesday'
    if row["Post published weekday_Thursday"] == 1:
        return 'Thursday'
    if row["Post published weekday_Friday"] == 1:
        return 'Friday'
    if row["Post published weekday_Saturday"] == 1:
        return 'Saturday'
    return 'Other'
datafb1['weekday'] = datafb1.apply(day, axis=1)
datafb2['weekday'] = datafb2.apply(day, axis=1)
datafb3['weekday'] = datafb3.apply(day, axis=1)
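The row-wise `apply` above works, but on 40k+ rows the same column can be recovered much faster with a vectorized `idxmax` over the one-hot columns. A sketch on a small hypothetical frame (column names as in the datasets):

```python
import pandas as pd

# one-hot weekday columns, named as in the notebook's datasets
day_cols = ["Post published weekday_" + d for d in
            ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]]

# hypothetical two-row frame: first post on Sunday, second on Wednesday
demo = pd.DataFrame([[1, 0, 0, 0, 0, 0, 0],
                     [0, 0, 0, 1, 0, 0, 0]], columns=day_cols)

# idxmax(axis=1) returns the column label of the 1 in each row;
# stripping the prefix leaves the weekday name
weekday = demo[day_cols].idxmax(axis=1).str.replace(
    "Post published weekday_", "", regex=False)
```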
6) Create new datasets with only the directly interpretable columns [For the visualization part only]
datafb1_resume=datafb1[["Page Popularity/likes","Page Checkings","Page talking about","Page Category",
"CC1","CC2","CC3","CC4","CC5","Base time","Post length","Post Share Count","Post Promotion Status",
"H Local","Target Variable","weekday"]]
datafb2_resume=datafb2[["Page Popularity/likes","Page Checkings","Page talking about","Page Category",
"CC1","CC2","CC3","CC4","CC5","Base time","Post length","Post Share Count","Post Promotion Status",
"H Local","Target Variable","weekday"]]
datafb3_resume=datafb3[["Page Popularity/likes","Page Checkings","Page talking about","Page Category",
"CC1","CC2","CC3","CC4","CC5","Base time","Post length","Post Share Count","Post Promotion Status",
"H Local","Target Variable","weekday"]]
7) Map the Page Category numbers to Page Category labels [For the visualization part only]
category_labels = {
1: "Product/service",
2: "Public figure",
3: "Retail and consumer merchandise",
4: "Athlete",
5: "Education website",
6: "Arts/entertainment/nightlife",
7: "Aerospace/defense",
8: "Actor/director",
9: "Professional sports team",
10: "Travel/leisure",
11: "Arts/humanities website",
12: "Food/beverages",
13: "Record label",
14: "Movie",
15: "Song",
16: "Community",
17: "Company",
18: "Artist",
19: "Non-governmental organization (ngo)",
20: "Media/news/publishing",
21: "Cars",
22: "Clothing",
23: "Local business",
24: "Musician/band",
25: "Politician",
26: "News/media website",
27: "Education",
28: "Author",
29: "Sports event",
30: "Restaurant/cafe",
31: "School sports team",
32: "University",
33: "Tv show",
34: "Website",
35: "Outdoor gear/sporting goods",
36: "Political party",
37: "Sports league",
38: "Entertainer",
39: "Church/religious organization",
40: "Non-profit organization",
41: "Automobiles and parts",
42: "Tv channel",
43: "Telecommunication",
44: "Entertainment website",
45: "Shopping/retail",
46: "Personal blog",
47: "App page",
48: "Vitamins/supplements",
49: "Professional services",
50: "Movie theater",
51: "Software",
52: "Magazine",
53: "Electronics",
54: "School",
55: "Just for fun",
56: "Club",
57: "Comedian",
58: "Sports venue",
59: "Sports/recreation/activities",
60: "Publisher",
61: "Tv network",
62: "Health/medical/pharmacy",
63: "Studio",
64: "Home decor",
65: "Jewelry/watches",
66: "Writer",
67: "Health/beauty",
68: "Music video",
69: "Appliances",
70: "Computers/technology",
71: "Insurance company",
72: "Music award",
73: "Recreation/sports website",
74: "Reference website",
75: "Business/economy website",
76: "Bar",
77: "Album",
78: "Games/toys",
79: "Camera/photo",
80: "Book",
81: "Producer",
82: "Landmark",
83: "Cause",
84: "Organization",
85: "Tv/movie award",
86: "Hotel",
87: "Health/medical/pharmaceuticals",
88: "Transportation",
89: "Local/travel website",
90: "Musical instrument",
91: "Radio station",
92: "Other",
93: "Computers",
94: "Phone/tablet",
95: "Coach",
96: "Tools/equipment",
97: "Internet/software",
98: "Bank/financial institution",
99: "Society/culture website",
100:"Small business",
101:"News personality",
102:"Teens/kids website",
103:"Government official",
104: "Photographer",
105: "Spas/beauty/personal care",
106: "Video game"
}
datafb1_resume = datafb1_resume.replace({"Page Category":category_labels})
datafb2_resume = datafb2_resume.replace({"Page Category":category_labels})
datafb3_resume = datafb3_resume.replace({"Page Category":category_labels})
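On frames of this size, `Series.map` is usually faster than `DataFrame.replace` for a code-to-label lookup, and it makes unknown codes visible. A sketch with a hypothetical truncated mapping:

```python
import pandas as pd

labels = {1: "Product/service", 2: "Public figure"}  # truncated, for illustration
codes = pd.Series([1, 2, 999])  # 999: a code absent from the mapping

# map() leaves unmatched codes as NaN; fillna makes them explicit
named = codes.map(labels).fillna("Unknown")
```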
1) Observe the correlation between the different elements in the first dataset
sns.set_theme(style="white")
# Compute the correlation matrix on numeric columns only
# ('Page Category' and 'weekday' now hold strings)
corr = datafb1_resume.select_dtypes('number').corr()
# Generate a mask for the upper triangle
mask = np.triu(np.ones_like(corr, dtype=bool))
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))
# Generate a custom diverging colormap
cmap = sns.diverging_palette(230, 20, as_cmap=True)
# Draw the heatmap with the mask and correct aspect ratio
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
square=True, linewidths=.5, cbar_kws={"shrink": .5})
ax.title.set_text("Correlation between the different elements in the first dataset")
Interpretation:
We can observe that the target variable is highly correlated with CC1 (the total number of comments before the selected base date/time), CC2 (the number of comments in the last 48 to last 24 hours relative to the base date/time), CC4 (the number of comments in the first 24 hours after publication, but before the base date/time) and CC5 (the difference between CC2 and CC3), which is expected.
Beyond those, the main features positively correlated with the target variable are Post Share Count and Page talking about. The latter measures the daily interest of individuals in the page's posts: people who actually come back to the page after liking it, through comments, likes on posts, shares, etc.
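The pairwise correlations read off the heatmap can also be computed directly with `corrwith`. A minimal sketch on synthetic data (the real call would be on `datafb1_resume`; the column names below mirror the notebook's, but the values are fabricated for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cc1 = rng.poisson(10, 200).astype(float)
demo = pd.DataFrame({
    "CC1": cc1,
    "Post Share Count": rng.poisson(5, 200).astype(float),
    # target built to depend on CC1, mimicking the strong link in the heatmap
    "Target Variable": cc1 * 0.8 + rng.normal(0.0, 1.0, 200),
})

# correlation of every feature with the target, highest first
corrs = (demo.drop(columns="Target Variable")
             .corrwith(demo["Target Variable"])
             .sort_values(ascending=False))
```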
def countoccurrences(store, value):
    # manual frequency counter (kept for reference; collections.Counter below does the same)
    try:
        store[value] = store[value] + 1
    except KeyError:
        store[value] = 1

# concatenate every category label, then split on spaces and slashes
text = ""
for label in datafb1_resume['Page Category']:
    text += label + ' '
words = text.replace("/", " ").split(' ')
frequencies = Counter(words)  # avoid shadowing the built-ins `list` and `dict`
# create the WordCloud object
wordcloud = WordCloud(min_word_length=3, background_color='white')
# generate the word cloud
wordcloud.generate_from_frequencies(frequencies)
#plot
plt.imshow(wordcloud, interpolation='bilinear')
plt.title("Most represented categories by word")
plt.axis('off')
plt.show()
Interpretation:
This is the word cloud of the most represented categories: "sports", "team", and "professional" are among the words that appear most frequently.
1) Focus on the Page Categories with the highest number of comments [in terms of sum of the target variable]
def plot_treemap(data,col,indication):
fig = px.treemap(data, path=["Page Category"], values=col, height=700,
title="Most commented topics - in terms of "+indication, color_discrete_sequence = px.colors.qualitative.Dark2)
fig.data[0].textinfo = 'label+text+value'
fig.show()
plot_treemap(datafb1_resume,'Target Variable',"sum")
2) Focus on the Page Categories with the highest average number of comments [in terms of mean of the target variable]
test = datafb1_resume.groupby('Page Category').mean(numeric_only=True)
test.reset_index(inplace=True)
plot_treemap(test,'Target Variable',"mean")
Interpretation:
In terms of total comments, the top-ranked page category is Professional sports team, with 94,339 comments overall. However, when we aggregate by average over the dataset instead, the ranking differs.
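The ranking flip between the two treemaps is exactly what the choice of groupby aggregation controls: a category with many modest posts can win on the sum while losing on the mean. A toy sketch with hypothetical categories:

```python
import pandas as pd

demo = pd.DataFrame({
    "Page Category": ["A", "A", "A", "B"],
    "Target Variable": [10, 10, 10, 25],
})

# A wins on total (30 vs 25), B wins on average (25 vs 10)
by_sum = demo.groupby("Page Category")["Target Variable"].sum()
by_mean = demo.groupby("Page Category")["Target Variable"].mean()
```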
4) Radar of the top 3 Categories
Top3 = datafb1_resume[datafb1_resume['Page Category'].isin(['Politician', 'Tv network', 'Tv show'])]
radar_chart = pygal.Radar()
radar_chart.title = "Main characteristics of the Top 3 Categories' Pages (in terms of average)"
radar_chart.x_labels = ['CC1', 'CC2', 'CC3', 'CC4', 'CC5', 'Base time', 'Post length',
                        'Post Share Count', 'Post Promotion Status', 'H Local',
                        'Target Variable']
# mean(numeric_only=True) avoids the nuisance-column FutureWarning;
# [3:] skips the first three numeric columns so the values line up with x_labels
radar_chart.add('Politician', Top3[Top3['Page Category'] == 'Politician'].mean(numeric_only=True).values[3:])
radar_chart.add('Tv network', Top3[Top3['Page Category'] == 'Tv network'].mean(numeric_only=True).values[3:])
radar_chart.add('Tv show', Top3[Top3['Page Category'] == 'Tv show'].mean(numeric_only=True).values[3:])
[Output omitted: pygal could not render the chart inline as PNG — an OSError reports that the cairo library is missing on this machine. pygal can still render the chart to SVG, its native format.]
Top3bis = datafb1_resume[datafb1_resume['Page Category'].isin(['Professional sports team', 'Artist', 'Musician/band'])]
radar_chart2 = pygal.Radar()
radar_chart2.title = "Main characteristics of the Top 3 Categories' Pages (in terms of sum)"
radar_chart2.x_labels = ['CC1', 'CC2', 'CC3', 'CC4', 'CC5', 'Base time', 'Post length',
                         'Post Share Count', 'Post Promotion Status', 'H Local',
                         'Target Variable']
radar_chart2.add('Professional sports team', Top3bis[Top3bis['Page Category'] == 'Professional sports team'].mean(numeric_only=True).values[3:])
radar_chart2.add('Artist', Top3bis[Top3bis['Page Category'] == 'Artist'].mean(numeric_only=True).values[3:])
radar_chart2.add('Musician/band', Top3bis[Top3bis['Page Category'] == 'Musician/band'].mean(numeric_only=True).values[3:])
[Output omitted: same cairo OSError as for the previous radar chart.]
Interpretation:
Here we can observe the combination and distribution of the different components that make up the pages of the top 3 page categories (by sum or by average), as studied above.
1) Visualize the distribution of page checkings according to the weekday
plt.rcParams.update({'font.size': 10})
check_day=pd.DataFrame(datafb1_resume.groupby('weekday')['Page Checkings'].mean())
#create pie chart
clrs = ['tomato' if x == max(check_day['Page Checkings']) else 'darkseagreen'
        for x in check_day['Page Checkings']]
fig = plt.figure(figsize=(5, 5))
plt.pie(check_day['Page Checkings'], labels=check_day.index, colors=clrs, autopct='%.f%%', textprops={'fontsize': 14})
plt.title("Avg of page checkings according to the weekday", fontsize=20)
plt.show()
Interpretation:
Saturday seems to be the day when most consumers use Facebook and visit pages. Its average share of page checkings is around 16%, i.e. 1 to 3 percentage points higher than on the other days.
2) Visualization of the average number of comments according to weekday
# comment_day: average number of comments (target variable) per weekday,
# built analogously to check_day above (it was not defined earlier in the notebook)
comment_day = pd.DataFrame(datafb1_resume.groupby('weekday')['Target Variable'].mean())
fig, ax = plt.subplots(figsize=(20, 10))
# highlight the maximum in orange-red and the minimum in green
clrs = ['steelblue' if x < max(comment_day['Target Variable']) else 'orangered'
        for x in comment_day['Target Variable']]
clrs2 = ['steelblue' if x > min(comment_day['Target Variable']) else 'seagreen'
         for x in comment_day['Target Variable']]
clrs[clrs2.index('seagreen')] = "seagreen"
sns.barplot(x=comment_day.index, y=comment_day['Target Variable'], data=comment_day, palette=clrs, ax=ax)
ax.set_title('Avg of comments per weekday', fontsize=20)
ax.set_xlabel('Weekday', fontsize=20)
ax.set_ylabel('Target Variable', fontsize=20)
for i, v in enumerate(comment_day['Target Variable'].items()):  # iteritems() is deprecated
    ax.text(i, v[1], "{:.2f} comments".format(v[1]), color='black', va='top',
            rotation=90, fontsize=20)
Interpretation:
This graph shows that, on average, the best day to post should be Wednesday, since commenting activity appears higher than on other days. The minimum is on Saturday, with an average of 6.23 comments per post.
3) Visualization of the top "Page talking about" values depending on weekday and category
df2 = pd.DataFrame(datafb1_resume.groupby(['Page Category', 'weekday'])['Page talking about'].mean())
df2.sort_values('Page talking about', ascending=False, inplace=True)  # persist the ordering so df2[0:40] is the top 40
df2.reset_index(inplace=True)
df2
fig = px.sunburst(df2[0:40], path=['weekday',"Page Category"], values='Page talking about',title="Main 'page talking about' depending on the day and the category")
fig.show()
4) Impact of the weekday on the number of post shares
Top3 = datafb1_resume[datafb1_resume['Page Category'].isin(['Professional sports team','Artist','Musician/band'])]
f, ax = plt.subplots(figsize=(15, 10))
sns.boxplot(x='weekday', y='Post Share Count', hue="Page Category", data=Top3[Top3["Post Share Count"]<400], palette="Set2")
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0., fontsize=20)
ax.set_xlabel("Weekday", fontsize=20)
ax.set_ylabel("Post Share Count", fontsize=20)
f, ax = plt.subplots(figsize=(15, 10))
sns.boxplot(x='weekday', y='Page talking about', hue="Page Category", data=Top3[Top3['Page talking about']<100000], palette="Set1")
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0., fontsize=20)
ax.set_xlabel("Weekday", fontsize=20)
ax.set_ylabel("Page talking about", fontsize=20)
Interpretation :
On the first graph we see that, on average, the "Professional sports team" page category has a higher share count. This could explain why it ranks first on the target variable.
</br>
Regarding the second graph, the "Page talking about" statistics for the "Professional sports team" category (1st quartile, median, 3rd quartile) are much higher than for the two other categories shown. This reinforces the hypothesis made with the first graph.
5) Influence of the weekday on the number of post shares and the number of comments per post, according to the page category
datafb1scatt=datafb1_resume[datafb1_resume["Target Variable"]!=0]
px.scatter(datafb1scatt, x="Target Variable", y="Post Share Count", animation_frame="Page Category", animation_group="weekday",
size="Post Share Count", color="weekday", hover_name="weekday",
size_max=1000,range_y=[0,400],range_x=[0,1000],opacity=1)
1) Correlation between the length of a post and its number of comments
datafb_popularity=datafb1_resume.groupby('Post length')["Target Variable"].mean().reset_index(level=None, drop=False,inplace=False)
px.line(datafb_popularity, x='Post length', y='Target Variable', title="Correlation between the length of a post and its number of comments")
Interpretation :
</br>
We observe that the post lengths generating the highest numbers of comments lie in a range between 1000 and 4800 characters.
Beyond that range, a long post is less likely to be "popular" in terms of the number of comments; this is probably because readers do not take the time to read and react to such posts.
2.1) Impact of the length and the share count on target variable
import plotly.express as px
datafb1scatt=datafb1_resume[datafb1_resume["Post Share Count"]<100000]
fig = px.scatter(datafb1scatt, x="Post length", y="Target Variable",
color="Post Share Count", size="Post Share Count",
log_x=True, size_max=60, title="Impact of the length and share count of a post on the Target variable")
fig.show()
Interpretation :
</br>
The graph above shows the joint impact of the length of a post and its distribution on Facebook (its number of shares) on the number of resulting comments. As surprising as it may seem, a large number of posts have a high share count but comparatively few comments. One way to explain this could be the presence of posts for which comments were disabled.
2.2) Impact of the length and the share count on the target variable, after removing rows with a null target variable
import plotly.express as px
datafb1scatt=datafb1_resume[datafb1_resume["Post Share Count"]<100000]
datafb1scatt=datafb1scatt[datafb1scatt["Target Variable"]!=0]
fig = px.scatter(datafb1scatt, x="Post length", y="Target Variable",
color="Post Share Count", size="Post Share Count",
log_x=True, size_max=60, title="Impact of the length and share count of a post on the Target variable")
fig.show()
2) Correlation between the page popularity and its number of comments
f, ax = plt.subplots(figsize=(10, 5))
sns.scatterplot(x=datafb1['Target Variable'],y=datafb1_resume['Page Popularity/likes'])
ax.title.set_text(" Correlation between the page popularity and its number of comment")
Interpretation :
Surprisingly, the popularity of a page does not seem to influence the number of comments under its posts.
3) Correlation between post number of share and target variable
Top3=Top3[Top3["Target Variable"]!=0]
px.scatter(Top3, x="Target Variable", y="Post Share Count", animation_frame="weekday", animation_group="Page Category",
size="Post length", color="Page Category", hover_name="Page Category",
size_max=1000,range_y=[0,400],range_x=[0,1300],opacity=1)
data0=datafb1_resume[datafb1_resume["Target Variable"]==0]
data0=data0.groupby('Page Category').count()
data0.reset_index(inplace=True)
data0['Count']=data0["Post length"]
data0
fig = px.bar(data0, x="Page Category", y="Count", title="Count and category of posts with a null target variable")
fig.show()
Interpretation :
Oddly, many page categories have a high number of posts with a null target variable (zero comments).
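The share of zero-comment posts can be quantified directly. A minimal sketch on a toy frame (the column names mirror this notebook's `datafb1_resume`; the values are made up):

```python
import pandas as pd

# Toy stand-in for datafb1_resume (hypothetical values)
df = pd.DataFrame({
    "Page Category": ["Artist", "Artist", "Musician/band", "Musician/band", "Athlete"],
    "Target Variable": [0, 12, 0, 0, 3],
})

# Share of zero-comment posts, overall and per category
overall = (df["Target Variable"] == 0).mean()
per_cat = df.groupby("Page Category")["Target Variable"].apply(lambda s: (s == 0).mean())

print(overall)   # 0.6
print(per_cat)
```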
# !pip install sklearn --upgrade
import sklearn
from sklearn.model_selection import train_test_split
print(sklearn.__version__)
0.24.2
X=datafb1[datafb1.columns.difference(['Target Variable','weekday'])]
y=datafb1["Target Variable"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
# We have 5 training datasets and 5 testing datasets but I'm doing everything on this one just for now
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
(27435, 53) (13514, 53) (27435,) (13514,)
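Note that without a fixed seed the split above changes on every run. A small sketch showing that passing `random_state` makes `train_test_split` reproducible:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Fixing random_state yields the same split across runs
X_tr1, X_te1, _, _ = train_test_split(X, y, test_size=0.33, random_state=42)
X_tr2, X_te2, _, _ = train_test_split(X, y, test_size=0.33, random_state=42)
print((X_te1 == X_te2).all())  # True
```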
1) Application to the data
from sklearn.decomposition import PCA
acp = PCA(svd_solver='full')
coord = acp.fit_transform(X_train)
print(acp.n_components_)
print(acp.explained_variance_ratio_) # ~99.98% explained by the first component (the features are not standardized)
53 [9.99829426e-01 1.61669461e-04 8.86816121e-06 1.78292950e-08 1.19765941e-08 2.85864736e-09 1.15989620e-09 8.39722454e-10 6.89569131e-10 2.81377099e-10 2.16238496e-10 4.42595488e-11 2.88611407e-11 2.38041177e-11 2.12676561e-11 9.11543318e-12 8.40142916e-12 7.87915841e-12 7.63627483e-12 7.39223946e-12 3.55221345e-12 2.93159839e-12 1.44812099e-12 1.13699788e-12 8.80307579e-13 4.90768061e-13 3.15786449e-13 2.20203722e-13 1.31263890e-13 9.84877718e-14 8.75259836e-14 7.52471048e-14 6.13797947e-14 3.30671670e-14 1.49022292e-14 8.69860564e-15 5.21180886e-15 4.93073114e-15 3.49832562e-15 3.42835044e-15 3.31323236e-15 3.19222858e-15 2.85724478e-15 2.80965104e-15 2.67505253e-15 2.61736222e-15 1.01123449e-15 9.59153747e-16 9.94266522e-33 9.94266522e-33 9.94266522e-33 9.94266522e-33 8.05915511e-33]
2) Scree plot
#number of observations used to fit the PCA
n = X_train.shape[0]
#number of components kept by the PCA
p = acp.n_components_
eigval = (n-1)/n*acp.explained_variance_
#scree plot
plt.plot(np.arange(1, p+1), eigval)
plt.title("Scree plot")
plt.ylabel("Eigen values")
plt.xlabel("Factor number")
plt.show()
Interpretation :
We can see that the curve collapses almost immediately: because the features were not standardized before the PCA, the first component alone captures nearly all of the variance. Let's keep digging in order to know exactly where the 95% threshold is reached.
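The exact number of components needed for a given variance threshold can be read off the cumulative ratios. A minimal sketch with made-up ratios (not this dataset's actual values):

```python
import numpy as np

# Hypothetical explained-variance ratios (stand-in for acp.explained_variance_ratio_)
ratios = np.array([0.50, 0.20, 0.15, 0.08, 0.04, 0.02, 0.01])

cumvar = np.cumsum(ratios)
# Index of the first component at which cumulative variance reaches 95%
n_keep = int(np.argmax(cumvar >= 0.95)) + 1
print(n_keep)  # 5
```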
3) Cumsum of explained variance
#cumsum of explained variance
plt.plot(np.arange(1, len(acp.explained_variance_ratio_)+1), np.cumsum(acp.explained_variance_ratio_))
plt.title(" Explained variance vs. # of factors", fontsize=15)
plt.ylabel("Cumsum explained variance ratio")
plt.xlabel("Factor number")
plt.show()
print(np.cumsum(acp.explained_variance_ratio_))
[0.99982943 0.9999911 0.99999996 0.99999998 0.99999999 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. ]
Interpretation :
The PCA reaches 95% of the explained variance almost immediately: since the features were not standardized, the first component alone explains more than 99.9%. We keep the first 25 components below, which is more than enough to retain almost all of the information; using fewer components would also make the predictions faster because we use less data.
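Rather than counting components by hand, `PCA` accepts a float in (0, 1) as `n_components` and keeps just enough components to reach that fraction of variance. A sketch on synthetic data with one dominant, unscaled feature, mimicking this dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 0] *= 50  # one dominant, unscaled feature, as in this dataset

# A float n_components asks PCA to keep just enough components
# to explain that fraction of the variance
pca = PCA(n_components=0.95, svd_solver="full")
pca.fit(X)
print(pca.n_components_)
print(pca.explained_variance_ratio_.sum() >= 0.95)  # True
```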
print(pd.DataFrame(acp.components_[0:25], columns=X.columns))
train_pca = acp.transform(X_train)
test_pca = acp.transform(X_test)
[Output: 25 rows × 53 columns — loadings of the first 25 principal components on each feature (Page Popularity/likes, Page Checkings, Page talking about, Page Category, Derived_*, weekday dummies, ...)]
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
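Since the scaler is fitted on the full training set here, an alternative worth noting is to wrap scaling and model in a pipeline, so that during cross-validation the scaler is re-fitted on each training fold and the validation fold never leaks into its statistics. A sketch on synthetic data:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=300)

# Scaling inside the pipeline is re-fitted on each training fold,
# so the validation fold never leaks into the scaler statistics
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean() > 0.9)  # near-linear target, so R² is high
```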
from sklearn import svm
svr = svm.SVR(kernel='linear')
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV
from sklearn import svm
from sklearn.model_selection import learning_curve
cross_val_score(svr, X_train, y_train)
#array([0.23037456, 0.25518354, 0.27412064, 0.24950104, 0.2338662 ]) — these scores are not very good
parameters = { 'gamma' : [0.01, 0.1] } # and 0.5
grid = GridSearchCV(svm.SVR(), parameters, n_jobs=-1, cv=5)
grid.fit(X_train, y_train)
print (grid.best_score_, grid.best_estimator_)
#0.15606157440610818 SVR(gamma=0.01)
N, train_scores, val_scores = learning_curve(svm.SVR(), X_train, y_train, train_sizes=np.linspace(0.1, 1, 10), cv=5)
plt.plot(N, train_scores.mean(axis=1), label="train")
plt.plot(N, val_scores.mean(axis=1), label="validation")
plt.title("SVR learning curve")
plt.legend()
[Output: KeyboardInterrupt — the SVR learning curve was interrupted; SVR training is very slow on ~27k samples]
from sklearn.linear_model import Lasso
algo = Lasso()
params = {"max_iter" : [ 1000],
"alpha" : [0.1],
"selection": ["random", "cyclic"]}
grid = GridSearchCV(algo, params, n_jobs=-1)
grid.fit(X_train, y_train)
print (grid.best_score_, grid.best_estimator_)
#0.23363314379533154 Lasso(alpha=0.1, selection='random')
N,train_scores,val_scores=learning_curve(Lasso(alpha=0.1, selection='random'),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
plt.plot(N,train_scores.mean(axis=1),label="train")
plt.plot(N,val_scores.mean(axis=1),label="validation")
plt.legend()
[Output: repeated ConvergenceWarning — the objective did not converge on several folds; increasing the number of iterations would help]
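These ConvergenceWarnings mean coordinate descent hit its iteration cap before reaching the tolerance. A minimal sketch (synthetic data, illustrative values) of giving the solver more room via `max_iter` and `tol`:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=100)

# A larger max_iter (and/or a looser tol) gives coordinate descent
# room to converge, silencing the ConvergenceWarning seen above
lasso = Lasso(alpha=0.1, max_iter=50000, tol=1e-4).fit(X, y)
print(lasso.n_iter_ < 50000)
```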
from sklearn.model_selection import GridSearchCV
def test_hyperparametres(algo, hyperparametres):
    grid = GridSearchCV(algo, hyperparametres, n_jobs=-1)
    grid.fit(X_train, y_train)
    print(grid.best_score_, grid.best_estimator_)
    return grid.best_score_, grid.best_estimator_
test_hyperparametres(Lasso(),{"max_iter" : [ 1000],
"alpha" : [1.5,0.5,1,2],
"selection": ["random", "cyclic"]})
#0.29186614237615527 Lasso(alpha=1.5, selection='random')
from sklearn.linear_model import Ridge
test_hyperparametres(Ridge(),{"max_iter" : [3000, 1000],
"alpha" : [0.5,1,2,100] })
# 0.22737442852720005 Ridge(alpha=100, max_iter=3000)
# (0.22737442852720005, Ridge(alpha=100, max_iter=3000))
from sklearn.linear_model import RidgeCV
test_hyperparametres(RidgeCV(),{})
# 0.22156061038994976 RidgeCV(alphas=array([ 0.1, 1. , 10. ]))
# (0.22156061038994976, RidgeCV(alphas=array([ 0.1, 1. , 10. ])))
from sklearn.linear_model import Ridge
N,train_scores,val_scores=learning_curve(Ridge(alpha=100, max_iter=3000),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
plt.plot(N,train_scores.mean(axis=1),label="train")
plt.plot(N,val_scores.mean(axis=1),label="validation")
plt.legend()
from sklearn.linear_model import ElasticNetCV
test_hyperparametres(ElasticNetCV(),{"l1_ratio":[0.5,0.25,0.75],"n_alphas":[100,150]})
#0.3215338376167568 ElasticNetCV(l1_ratio=0.75)
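The grid scores above are cross-validation scores on the training set; the retained model should finally be scored once on the held-out test set. A sketch on synthetic data (the real notebook would use `X_train`/`X_test` from above):

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 6))
y = X @ np.array([2.0, 0.0, -1.0, 0.5, 0.0, 1.5]) + rng.normal(scale=0.2, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

# Refit the selected configuration on the full training set,
# then score once on the untouched test set
best = ElasticNetCV(l1_ratio=0.75).fit(X_tr, y_tr)
r2 = r2_score(y_te, best.predict(X_te))
print(round(r2, 2))
```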
from sklearn.linear_model import ElasticNetCV
N,train_scores,val_scores=learning_curve(ElasticNetCV(l1_ratio=0.75),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
plt.plot(N,train_scores.mean(axis=1),label="train")
plt.plot(N,val_scores.mean(axis=1),label="validation")
plt.legend()
[Output: repeated ConvergenceWarning — the objective did not converge on several folds; increasing the number of iterations would help]
You might want to increase the number of iterations. Duality gap: 3622.4333466487005, tolerance: 592.4521659832953 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3557.7426299815997, tolerance: 592.4521659832953 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3362.9327093050815, tolerance: 592.4521659832953 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1089.8884450076148, tolerance: 754.6116801679955 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1537.494816860184, tolerance: 754.6116801679955 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2145.777036845684, tolerance: 754.6116801679955 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2606.1598486807197, tolerance: 754.6116801679955 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 3003.45534382388, tolerance: 754.6116801679955 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1723.7127398382872, tolerance: 1531.2593607244562 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1957.2895798198879, tolerance: 1221.8712783004898 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 4848.254037476145, tolerance: 1221.8712783004898 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 7460.610247780569, tolerance: 1221.8712783004898 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 6952.296051971614, tolerance: 1221.8712783004898 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 8362.42261996679, tolerance: 1221.8712783004898 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3311.1163467485458, tolerance: 1533.9030189861405 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 3089.874849807471, tolerance: 1533.9030189861405 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 4131.738767065108, tolerance: 1533.9030189861405 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2231.790467424318, tolerance: 1676.3580992595405 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2894.4743386767805, tolerance: 1676.3580992595405 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2833.9722417481244, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 5252.076984541491, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 6999.438617870212, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 8020.166560521349, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 6893.532730547711, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 5936.660393850878, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 12822.462960066274, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 19309.592965137213, tolerance: 1411.2340144769319 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1822.8483565691859, tolerance: 1372.64327908869 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2635.121203176677, tolerance: 1372.64327908869 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3487.666823754087, tolerance: 1831.531573199902 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 3026.4676789827645, tolerance: 1831.531573199902 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 5141.814803371206, tolerance: 1831.531573199902 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2002.3024202976376, tolerance: 1948.7967740851486 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2881.0084279794246, tolerance: 2108.054051039441 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3205.82492351532, tolerance: 2108.054051039441 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3405.874193955213, tolerance: 2108.054051039441 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3138.378608327359, tolerance: 2108.054051039441 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 2184.4137111753225, tolerance: 1967.4191146436954 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2368.572245582938, tolerance: 1967.4191146436954 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2177.0612260252237, tolerance: 2099.0306561954185 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2442.4680789895356, tolerance: 2294.7723666772144 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3936.518393272534, tolerance: 2294.7723666772144 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3363.5467825699598, tolerance: 2405.872477601232 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3823.323918128386, tolerance: 2405.872477601232 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 303.6692972916644, tolerance: 234.73900763532768 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 413.0724655010272, tolerance: 252.64398153846173 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 633.9117336433847, tolerance: 252.64398153846173 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 780.2366830047686, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 892.1202107875142, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 884.2602698120754, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 811.5766523920465, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2067.908580366755, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 3314.613811312476, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3821.7552164148074, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3863.171299995389, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 4981.408550209831, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 277.9369730758481, tolerance: 258.83150076901177 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 400.4793113942724, tolerance: 258.83150076901177 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 528.3692598761991, tolerance: 346.2983565365993 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 686.8698947899975, tolerance: 409.227404063034 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 1241.5312220533378, tolerance: 409.227404063034 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1314.2517857560888, tolerance: 409.227404063034 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1523.5318165454082, tolerance: 409.227404063034 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1372.376777104102, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2212.121073514223, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2806.987228134647, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3678.047205203213, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 4332.458661866374, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3865.253098247573, tolerance: 893.9655837676211 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1652.9278723299503, tolerance: 1437.0382559426614 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2070.348177390173, tolerance: 1668.0993316593886 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1485.205832529813, tolerance: 1292.559467790014 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1988.852933904156, tolerance: 1292.559467790014 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2516.3553577195853, tolerance: 1292.559467790014 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2674.124694553204, tolerance: 1292.559467790014 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 2258.367623668164, tolerance: 1292.559467790014 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 303.6692972916644, tolerance: 234.73900763532768 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 413.0724655010272, tolerance: 252.64398153846173 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 633.9117336433847, tolerance: 252.64398153846173 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 780.2366830047686, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 892.1202107875142, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 884.2602698120754, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
Duality gap: 811.5766523920465, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 2067.908580366755, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3314.613811312476, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3821.7552164148074, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 3863.171299995389, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 4981.408550209831, tolerance: 298.60651088319094 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 277.9369730758481, tolerance: 258.83150076901177 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 400.4793113942724, tolerance: 258.83150076901177 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. 
You might want to increase the number of iterations. Duality gap: 528.3692598761991, tolerance: 346.2983565365993 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 565.5988690415397, tolerance: 451.6675955572432 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 685.7950049727224, tolerance: 656.6361521856754 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 943.8819899060763, tolerance: 656.6361521856754 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1257.6513894437812, tolerance: 780.1826796218248 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1130.6204879973084, tolerance: 922.7440475128158 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1636.9244227325544, tolerance: 922.7440475128158 C:\Users\alixp\anaconda3\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:526: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. 
[Cell output truncated: many repeated scikit-learn warnings from coordinate_descent.py of the form "ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: ..., tolerance: ..." (raised by the Lasso/ElasticNet solvers), followed by the learning-curve legend object.]
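The ConvergenceWarning output above comes from scikit-learn's coordinate-descent solver, used by Lasso and ElasticNet. A minimal hedged sketch of the fix the warning itself suggests, raising `max_iter` above the default of 1000 (5000 is an illustrative value, not one tuned for this dataset):

```python
from sklearn.linear_model import Lasso

# Raising max_iter gives the coordinate-descent solver more room to converge;
# 5000 here is illustrative, not a value validated on the Facebook data.
lasso = Lasso(alpha=0.1, max_iter=5000, selection='random')
print(lasso.max_iter)
```

Loosening `tol` is the other knob the solver exposes, at the cost of a less precise optimum.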
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV  # used below; was missing in this cell

algorf = RandomForestRegressor()
parametersrf = {"max_depth": [10], "random_state": [0], "min_samples_split": [6]}
gridrf = GridSearchCV(algorf, parametersrf, n_jobs=-1)
gridrf.fit(train_pca, y_train)
print (gridrf.best_score_, gridrf.best_estimator_)
# 0.3381610958414715 RandomForestRegressor(max_depth=10, min_samples_split=6, random_state=0)
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

algorf = RandomForestRegressor()
parametersrf = {"max_depth": [15], "random_state": [0], "min_samples_split": [6]}
gridrf = GridSearchCV(algorf, parametersrf, n_jobs=-1)
gridrf.fit(X_train, y_train)
print (gridrf.best_score_, gridrf.best_estimator_)
#0.6059586096656977 RandomForestRegressor(max_depth=15, min_samples_split=6, random_state=0)
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve  # was missing in this cell

N, train_scores, val_scores = learning_curve(
    RandomForestRegressor(max_depth=15, min_samples_split=6, random_state=0),
    X_train, y_train, train_sizes=np.linspace(0.1, 1, 10), cv=5)
plt.plot(N,train_scores.mean(axis=1),label="train")
plt.plot(N,val_scores.mean(axis=1),label="validation")
plt.legend()
from numpy import mean
from numpy import std
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedKFold
# Grid search over gradient-boosting hyperparameters on the training set
parametersboost = {"loss": ["squared_error"], "learning_rate": [0.1, 0.5], "n_estimators": [130], "subsample": [0.5, 1.0]}
algoboost=GradientBoostingRegressor()
gridboost = GridSearchCV(algoboost, parametersboost, n_jobs=-1)
gridboost.fit(X_train, y_train)
print (gridboost.best_score_, gridboost.best_estimator_)
gridboost.best_params_
#0.6180486041539913-> 0.6185435042843775-> 0.6196323958863511 GradientBoostingRegressor() {'learning_rate': 0.1,'loss': 'squared_error','n_estimators': 120,'subsample': 1.0}
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import learning_curve

N, train_scores, val_scores = learning_curve(
    GradientBoostingRegressor(learning_rate=0.1, loss="squared_error", n_estimators=120, subsample=1.0),
    X_train, y_train, train_sizes=np.linspace(0.1, 1, 10), cv=5)
plt.plot(N,train_scores.mean(axis=1),label="train")
plt.plot(N,val_scores.mean(axis=1),label="validation")
plt.legend()
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import StackingRegressor
from sklearn.model_selection import GridSearchCV
estimators = [
    ('ridge', RidgeCV()),
    ('lasso', LassoCV(random_state=42)),
    ('knr', KNeighborsRegressor(n_neighbors=20, metric='euclidean')),
]
final_estimator = GradientBoostingRegressor(n_estimators=25, subsample=0.5,
                                            min_samples_leaf=25, max_features=1,
                                            random_state=42)
reg = StackingRegressor(estimators=estimators, final_estimator=final_estimator)
reg.fit(X_train, y_train)
paramreg = {}  # empty grid: GridSearchCV is used here only for its cross-validated score
gridreg = GridSearchCV(reg,paramreg, n_jobs=-1)
gridreg.fit(X_train, y_train)
print (gridreg.best_score_, gridreg.best_estimator_)
# 0.33
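Since `best_score_` above is a cross-validated score on the training set, a self-contained sanity check of the same stacking setup might look like the sketch below (on synthetic `make_regression` data, not the project dataset):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic regression data stands in for the Facebook dataset here.
X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

reg = StackingRegressor(
    estimators=[('ridge', RidgeCV()), ('lasso', LassoCV(random_state=42))],
    final_estimator=GradientBoostingRegressor(n_estimators=25, random_state=42))
reg.fit(X_tr, y_tr)
print(round(reg.score(X_te, y_te), 3))  # held-out R^2
```

On real data the held-out score is the number to compare across models, since cross-validated training scores can be optimistic.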
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()
eclf = VotingClassifier(
    estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
    voting='soft')
params = {'lr__C': [1.0, 10.0], 'rf__n_estimators': [20, 100]}
grid = GridSearchCV(estimator=eclf, param_grid=params, cv=5)
grid = grid.fit(X_train, y_train)
print (grid.best_score_, grid.best_estimator_)
#0.489119737561509 VotingClassifier(estimators=[('lr', LogisticRegression(random_state=1)),
#('rf',RandomForestClassifier(n_estimators=20,random_state=1)),('gnb', GaussianNB())],voting='soft')
from sklearn.model_selection import learning_curve
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNetCV
from sklearn.linear_model import RidgeCV
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn import svm
from sklearn.ensemble import GradientBoostingRegressor
Nsvm,train_scoressvm,val_scoressvm=learning_curve(svm.SVR(),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
NLasso,train_scoresLasso,val_scoresLasso=learning_curve(Lasso(alpha=0.1, selection='random'),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
Nelast,train_scoreselast,val_scoreselast=learning_curve(ElasticNetCV(l1_ratio=0.75),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
Nrfr,train_scoresrfr,val_scoresrfr=learning_curve(RandomForestRegressor(max_depth=15, min_samples_split=6, random_state=0),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
Ngbr,train_scoresgbr,val_scoresgbr=learning_curve(GradientBoostingRegressor(learning_rate=0.1,loss="squared_error",n_estimators=120,subsample=1.0),X_train,y_train,train_sizes=np.linspace(0.1,1,10),cv=5)
plt.plot(Nsvm,train_scoressvm.mean(axis=1),label="train_svm",color='b')
plt.plot(Nsvm,val_scoressvm.mean(axis=1),label="validation_svm",color="navy")
plt.plot(NLasso,train_scoresLasso.mean(axis=1),label="train_lasso",color="g")
plt.plot(NLasso,val_scoresLasso.mean(axis=1),label="validation_lasso",color="darkgreen")
plt.plot(Nelast,train_scoreselast.mean(axis=1),label="train_elasticnetcv",color="orange")
plt.plot(Nelast,val_scoreselast.mean(axis=1),label="validation_elasticnetcv",color="darkorange")
plt.plot(Nrfr,train_scoresrfr.mean(axis=1),label="train_randomForest",color="aqua")
plt.plot(Nrfr,val_scoresrfr.mean(axis=1),label="validation_randomForest",color="c")
plt.plot(Ngbr,train_scoresgbr.mean(axis=1),label="train_gradientBoosting",color="red")
plt.plot(Ngbr,val_scoresgbr.mean(axis=1),label="validation_gradientBoosting",color="crimson")
plt.legend(loc='upper right', bbox_to_anchor=(0.5, 0.5,1.2,0.5))
import matplotlib.ticker as ticker
from sklearn.model_selection import validation_curve
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNetCV
from sklearn.linear_model import RidgeCV
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn import svm
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RepeatedKFold
algos=[svm.SVR(),Lasso(),RidgeCV(),ElasticNetCV(),RandomForestRegressor(),GradientBoostingRegressor()]
params = [
    {"gamma": [0.01]},
    {"max_iter": [500], "alpha": [0.1], "selection": ["random"]},
    {"max_iter": [1000], "alpha": [0.5, 2]},
    {"l1_ratio": [0.75]},
    {"max_depth": [15], "random_state": [0], "min_samples_split": [6]},
    {"loss": ["squared_error"], "learning_rate": [0.5], "n_estimators": [120], "subsample": [1.0]},
]
performances = {}
classes_de_models_a_tester = algos
best_algorithm = 0
best_perf = 0
for k in range(0, len(algos)):
    try:
        algorithme = algos[k]
        #grid = GridSearchCV(algorithme, params[k], n_jobs=-1)
        algorithme.fit(X_train, y_train)
        performance = algorithme.score(X_test, y_test)
        print(performance)
        if performance > best_perf:
            best_algorithm = algorithme
            best_perf = performance
        if 0 < performance < 1:
            performances[algorithme] = [performance]
    except Exception as e:
        if "label" in str(e): print("Classification algorithm")
        else: print(str(e)[:50])
#print ("="*30)
best_algorithm, best_perf
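One caveat with the loop above: `performances` is keyed by the fitted estimator objects themselves, which is why the DataFrame index has to be patched by hand further down. A small sketch (hypothetical, not from the original notebook) keying by class name instead:

```python
from sklearn.linear_model import Lasso, RidgeCV

# Class names make readable, stable DataFrame indexes;
# the scores here are placeholders, not real results.
performances_by_name = {}
for est in [Lasso(), RidgeCV()]:
    performances_by_name[type(est).__name__] = [0.0]

print(sorted(performances_by_name))  # ['Lasso', 'RidgeCV']
```

With string keys, `pd.DataFrame(performances_by_name).T` would need no `df["index"][k] = ...` corrections afterwards.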
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
df = pd.DataFrame(performances).T
col_name = "performance"
df.columns = [col_name]
df.performance.sort_values()
liste_des_performances = df.performance.values
gains = [0]
for indice, performance in enumerate(liste_des_performances):
    if indice > 0:
        previous_value = liste_des_performances[indice-1]
        current_value = liste_des_performances[indice]
        gain = (current_value - previous_value) / previous_value
        gains.append(round(gain*100, 2))
df["gains"] = gains
df.reset_index(inplace=True)
# .loc avoids pandas' chained-assignment warning
df.loc[4, "index"] = RandomForestRegressor()
df.loc[5, "index"] = GradientBoostingRegressor()
df.loc[3, "index"] = RidgeCV()
df.set_index('index',inplace=True)
#df = df.drop("gains", axis=1)
df = df.sort_values(col_name)
#ax = df.plot(rot=45, x_compat=True)
fig , axes = plt.subplots(1,1)
fig.set_size_inches(9,3)
axes.xaxis.set_ticklabels(df.index)
axes.xaxis.set_major_locator(ticker.MultipleLocator(1))
plt.title("Performance by model")
df.plot(rot=90, ax=axes)
fig , axes = plt.subplots(1,1)
fig.set_size_inches(9,3)
#df = df.sort_values(col_name )
axes.xaxis.set_ticklabels(df.index)
axes.xaxis.set_major_locator(ticker.MultipleLocator(1))
plt.title("Performance by model")
df.plot(kind='bar', rot=90, ax=axes)